Distance dependent Chinese restaurant processes

نویسندگان

  • David M. Blei
  • Peter I. Frazier
چکیده

We develop the distance dependent Chinese restaurant process, a flexible class of distributions over partitions that allows for dependencies between the elements. This class can be used to model many kinds of dependencies between data in infinite clustering models, including dependencies arising from time, space, and network connectivity. We examine the properties of the distance dependent CRP, discuss its connections to Bayesian nonparametric mixture models, and derive a Gibbs sampler for both fully observed and latent mixture settings. We study its empirical performance with three text corpora. We show that relaxing the assumption of exchangeability with distance dependent CRPs can provide a better fit to sequential data and network data. We also show that the distance dependent CRP representation of the traditional CRP mixture leads to a faster-mixing Gibbs sampling algorithm than the one based on the original formulation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Search for Distance Dependent Chinese Restaurant Processes

The distance dependent Chinese Restaurant Processes (ddCRP), a nonparametric Bayesian model, can model distance sensitive data. Existing inference algorithms for dd-CRP, such as Markov Chain Monte Carlo (MCMC) and variational algorithms, are inefficient and unable to handle massive online data, because posterior distributions of dd-CRP are not marginal invariant. To solve this problem, we prese...

متن کامل

Tracklet clustering for robust multiple object tracking using distance dependent Chinese restaurant processes

To contrive an accurate and efficient strategy for object detection–object track assignment problem, we present a tracklet clustering approach using distance dependent Chinese restaurant processes (ddCRPs), which employ a two-level robust object tracker. The first level is an ordinary tracklet generator that obtains short yet reliable tracklets. In the second level, we cluster the tracklets ove...

متن کامل

Spatial distance dependent Chinese restaurant processes for image segmentation

The distance dependent Chinese restaurant process (ddCRP) was recently introduced to accommodate random partitions of non-exchangeable data [1]. The ddCRP clusters data in a biased way: each data point is more likely to be clustered with other data that are near it in an external sense. This paper examines the ddCRP in a spatial setting with the goal of natural image segmentation. We explore th...

متن کامل

Spectral Chinese Restaurant Processes: Nonparametric Clustering Based on Similarities

We introduce a new nonparametric clustering model which combines the recently proposed distance-dependent Chinese restaurant process (dd-CRP) and non-linear, spectral methods for dimensionality reduction. Our model retains the ability of nonparametric methods to learn the number of clusters from data. At the same time it addresses two key limitations of nonparametric Bayesian methods: modeling ...

متن کامل

A Gibbs Sampler for Spatial Clustering with the Distance-dependent Chinese Restaurant Process

The distance-dependent Chinese Restaurant Process (dd-CRP) is a flexible class of distributions over partitions which was recently introduced by [1, 2]. In their description and experiments Blei and Frazier focus on the sequential setting such as clustering over time. Their Gibbs sampler, while general in nature, does not explicitly handle the case of non-sequential (also called spatial) cluste...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010